Abstract

Our main research interest is optimizing n-input sorting networks: mathematical
objects that are oblivious to the order of the input data and always perform the same set of
pre-determined operations to produce a sorted list of n numbers. In the early 2000s, Choi and
Moon presented a sorting/comparator network isomorphism and normalization. These are used
to substantially reduce the search space for optimal n-input sorting networks by considering
only representative networks up to the isomorphism. Choi and Moon prove that checking
whether two n-input networks are isomorphic is polynomially reducible to the bounded
valence graph isomorphism (GI) problem. In 2013, Bundala and Zavodny described a new
sorting network relation (subitemset isomorphism) which is 'superior' to that of Choi and
Moon: any networks that are CM (Choi and Moon) isomorphic are also BZ (Bundala and
Zavodny) subitemset isomorphic, but the converse is not true in general. Bundala and
Zavodny's sorting network subitemset isomorphism drastically reduces the search space for
optimal sorting networks in comparison to previous methods. Their (BZ) isomorphism is at
the core of their computer-assisted proof of depth optimality of n-input sorting networks
for $11 \le n \le 16$, and also of Codish et al.'s computer-assisted proof of comparator
optimality for nine- and ten-input sorting networks.

The subitemset isomorphism problem is of considerable practical importance and there are
excellent practical solutions described in the literature. However, the computational
complexity analysis and classification of the BZ subitemset isomorphism problem is currently
an open problem. In this paper we prove that checking whether two sorting networks are BZ
isomorphic to each other is GI-Complete; the general GI (Graph Isomorphism) problem
is known to be in NP and LWPP, but is widely believed to be neither in P nor NP-Complete;
recent research suggests that the problem is in QP. Moreover, we state the BZ sorting
network isomorphism problem as a general isomorphism problem on itemsets, because
every sorting network is represented by Bundala and Zavodny as an itemset. The
complexity classification presented in this paper applies to sorting networks, as well as to
the general itemset isomorphism problem. The main consequence of our work is that
currently no polynomial-time algorithm is known for solving the BZ sorting network
subitemset isomorphism problem; however, the CM sorting network isomorphism problem can
be efficiently solved in polynomial time.
1. Introduction


1.1. Structure 

We have structured the presentation of our work in the following manner. First, we
give a brief introduction to the sorting network optimization problem to describe one
real-world instance of the (generic) problem tackled in this paper. Next, we give a formal
description of the Subitemset Isomorphism (SI) problem together with the necessary
terminology. Then, we give a summary of the related work on the complexity classification
of the problem. In Section 4 we present our main result by proving that the Itemset
Isomorphism (II) problem is GI-Complete; an immediate consequence is that the SI problem is
GI-Hard. In Appendix A and Appendix B we present a set of examples that illustrate the
main points of the rather technical complexity classification proofs. We conclude by
presenting a brief summary of our work and discussing possibilities for future contributions.


1.2. Terminology 

We now proceed with the formal definitions of all the mathematical objects that are
used throughout this paper. Visual examples of all object types are presented in Figure 1.
Unless otherwise stated, we assume to be working in the domain $D = \{d_1, d_2, \ldots, d_n\}$ of
n distinct elements.


• item: a set of elements over the domain D. We represent an item $I$ as a binary
string of length n where the i-th bit is equal to 1 iff the element $d_i \in I$, for all
$1 \le i \le n$; i.e. $I \in \{0,1\}^n$. See Figure 1(a) for examples of items.

• itemset: a set of items over the domain D. We represent an itemset $S$ as a matrix
with $|S|$ rows and n columns over the field $\{0,1\}$. See Figure 1(b) for examples of
itemsets.

• dataset: a set of itemsets over the domain D, ordered by cardinality in ascending
order. See Figure 1(c) for examples of datasets.
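
As an illustration of these three representations, a minimal Python sketch is given below. It is
not part of the original formalism: the helper names `item_to_bits` and `itemset_to_matrix` are
hypothetical, and encoding an item by the 1-based indices of its elements is an assumption made
for this example only. The items a, b, c and d are the ones listed in Figure 1(a).

```python
from typing import FrozenSet, List

Item = FrozenSet[int]  # assumption: an item stores the 1-based indices i of its elements d_i


def item_to_bits(item: Item, n: int) -> str:
    """Binary string of length n whose i-th bit is 1 iff d_i belongs to the item."""
    return "".join("1" if i in item else "0" for i in range(1, n + 1))


def itemset_to_matrix(itemset: FrozenSet[Item], n: int) -> List[str]:
    """Rows of the 0/1 matrix of an itemset (sorted only to fix a printing order)."""
    return sorted(item_to_bits(item, n) for item in itemset)


# Items over a seven-element domain, as listed in Figure 1(a).
a = frozenset({2, 3, 4, 6})
b = frozenset({1, 4})
c = frozenset({1, 5})
d = frozenset({1, 2, 5, 7})

S = frozenset({b, d})              # itemset with two items
T = frozenset({a, c, d})           # itemset with three items
dataset = sorted([T, S], key=len)  # itemsets ordered by ascending cardinality

print(item_to_bits(a, 7))
print(itemset_to_matrix(S, 7))
```

Using `frozenset` mirrors the definitions directly: duplicates inside an itemset are impossible
and the order of the rows of the matrix carries no meaning.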

We have chosen the labels of the objects to match those of itemset mining algorithms,
because the extremal sets identification problem is a sub-problem of the main
task. Codish et al. [6] describe a subset of our problem as 'words up to permutations'
instead of the generalization of 'itemsets up to subitemset isomorphism'. We consider
the choice of names to be a matter of personal preference, because what matters is the
mathematical structure of the objects that we work with, not the labels used. Hence, we
are as rigorous as possible in Section 1.2 when it comes to defining the objects.
    a = {d_2, d_3, d_4, d_6}      = 0111010
    b = {d_1, d_4}                = 1001000
    c = {d_1, d_5}                = 1000100
    d = {d_1, d_2, d_5, d_7}      = 1100101

(a) Item: a set of elements over the domain $D = \{d_1, d_2, \ldots, d_7\}$. The items a, b, c
and d are presented. We can always represent a set over a domain D as a binary string of
length $|D|$ where the i-th bit equals one iff the element $d_i$ is contained in the set.

    S   d_1 d_2 d_3 d_4 d_5 d_6 d_7        T   d_1 d_2 d_3 d_4 d_5 d_6 d_7
    b    1   0   0   1   0   0   0         a    0   1   1   1   0   1   0
    d    1   1   0   0   1   0   1         c    1   0   0   0   1   0   0
                                           d    1   1   0   0   1   0   1

(b) Itemset: a set of items over the domain D. The two itemsets $S = \{b, d\}$ and
$T = \{a, c, d\}$ over the domain $D = \{d_1, d_2, \ldots, d_7\}$ are presented. Remember that
there are no duplicate items within an itemset.

    F = (S, T), with S and T as in panel (b) and |S| = 2, |T| = 3

(c) Dataset: an ordered set of itemsets over the domain D. The dataset $F = (S, T)$ over the
domain $D = \{d_1, d_2, \ldots, d_7\}$ is presented. Remember that the itemsets within a dataset
are ordered increasingly by cardinality.

Figure 1: Graphical representation of the mathematical objects that are used throughout the
paper: item, itemset and dataset. For all of the examples in this figure, we use a domain
$D = \{d_1, d_2, \ldots, d_7\}$ of seven elements. For a formal definition, please refer to
Section 1.2.

1.3. Operations 

Having defined all of the necessary terminology (Section 1.2), we now formally put
forward the respective operations that we investigate in this paper.

Definition 1.1. Let S and T be itemsets over the domains $D_S$ and $D_T$, respectively. We say that
S is isomorphic to T iff there exists a bijection $J : D_S \to D_T$ such that $J(S) = T$, also written
as $S \cong T$; where $J(S) = \{\{J(d) \mid d \in I\} \mid I \in S\}$. If $D_S = D_T$ then we refer to J as an
automorphism.

Remark 1.2. Let $S = \{S_1, S_2, \ldots, S_k\}$ and $T = \{T_1, T_2, \ldots, T_k\}$ be itemsets over the domains
$D_S$ and $D_T$ such that S is isomorphic to T, given by $J(S) = T$. Since S and T are sets, there exists
a bijection $\sigma : \{1, 2, \ldots, k\} \to \{1, 2, \ldots, k\}$ which maps the items of S to the items of T such
that for all $S_i \in S$ we have $J(S_i) = T_{\sigma(i)}$.

Definition 1.3. Let S and T be itemsets over the same domain D. We say that S is a subset of T
up to isomorphism iff there exists a bijection $J : D \to D$ such that $J(S) \subseteq T$, also written as
$S \preceq T$.
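
The two relations can be made concrete with a small brute-force sketch in Python. This is only
an illustration of Definitions 1.1 and 1.3 under our own assumptions: the function names
`apply_bijection` and `find_bijection` are hypothetical, and the search enumerates all bijections
between the domains, so its running time is factorial in the domain size. It is not one of the
practical algorithms from the literature.

```python
from itertools import permutations
from typing import Dict, FrozenSet, Hashable, Iterable, Optional

Item = FrozenSet[Hashable]
Itemset = FrozenSet[Item]


def apply_bijection(J: Dict[Hashable, Hashable], S: Itemset) -> Itemset:
    """J(S) = {{J(d) | d in I} | I in S}, as in Definition 1.1."""
    return frozenset(frozenset(J[d] for d in item) for item in S)


def find_bijection(S: Itemset, T: Itemset,
                   dom_S: Iterable[Hashable], dom_T: Iterable[Hashable],
                   subset: bool = False) -> Optional[Dict[Hashable, Hashable]]:
    """Search for J with J(S) = T, or with J(S) a subset of T when subset=True."""
    dom_S, dom_T = list(dom_S), list(dom_T)
    if len(dom_S) != len(dom_T):
        return None  # no bijection between domains of different sizes
    for image in permutations(dom_T):
        J = dict(zip(dom_S, image))
        JS = apply_bijection(J, S)
        if (JS <= T) if subset else (JS == T):
            return J
    return None


dom = [1, 2, 3, 4]
S = frozenset(map(frozenset, [{1, 2, 3}, {1, 4}, {1, 2}, {3, 4}]))
T = frozenset(map(frozenset, [{2, 3, 4}, {1, 4}, {2, 4}, {1, 3}]))

J = find_bijection(S, T, dom, dom)                               # isomorphism test
small = frozenset({frozenset({1, 2})})
J_sub = find_bijection(small, T, dom, dom, subset=True)          # subitemset test
```

Here `S` and `T` differ by a relabelling of the domain, so an isomorphism is found, and the
one-item itemset `small` is a subset of `T` up to isomorphism.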
1.4. The Problems of Interest

Definition 1.4. Itemset Isomorphism (II) decision problem:

Input: Two itemsets S and T over the domains $D_S$ and $D_T$, respectively.

Question: Is there a bijection $J : D_S \to D_T$ s.t. $J(S) = T$?

Definition 1.5. Subitemset Isomorphism (SI) decision problem:

Input: Two itemsets S and T over the domain D.

Question: Is there a bijection $J : D \to D$ s.t. $J(S) \subseteq T$?

What is the worst-case complexity class of any algorithm for solving the II and SI
problems?

1.5. Contributions 

The main contributions of our work can be summarized as follows.

• Itemset Isomorphism: GI-Complete. In Section 4 we present a proof that the itemset
isomorphism decision problem (equality of itemsets up to a bijection) is exactly
as difficult as the Graph Isomorphism decision problem. The problem is of great
importance in the sorting network optimization domain.

• Subitemset Isomorphism: GI-Hard. As an immediate consequence, the problem
of finding class representative itemsets up to subitemset isomorphism within a
dataset is GI-Hard, that is, at least as hard as GI. This problem has been encountered
before in recent research in the sorting networks optimization
domain, but its worst-case computational complexity has never been classified.

2. Motivation: Sorting Network Optimization 

2.1. Preliminaries

A sorting network is a mathematical object consisting of exactly n wires and a set of
comparators, designed to sort an input of n numbers. Sorting networks are oblivious to the
order of the input data and always perform the same set of pre-determined operations to
produce a sorted list of n numbers. The problem of finding optimal sorting networks was
first proposed by Bose and Nelson more than 50 years ago. There are two common
measures for the optimality of a sorting network: the number of levels (depth) and the number
of comparators. The problem studied in this paper is central [8] to both sorting
network optimization problems.
2.2. The Key Concept

We now restate a key idea in optimizing sorting networks. First, we note that any
comparator network can be represented as the itemset of its outputs when the network is
applied to all possible inputs. The number of possible comparator networks is exponential
in the number of channels. Therefore the enormous search space for optimal sorting
networks naturally gives rise to the concept of subitemset isomorphism, denoted as $\preceq$.
More specifically, when searching for optimal sorting networks, we can discard comparator
networks whose itemset representation is not minimal up to $\preceq$; i.e. given a dataset D,
we need to find class representative itemsets up to $\preceq$ within D that are of minimal
cardinality. This idea is at the core of recent computer-assisted proofs for the level-optimal
n-input sorting networks for $11 \le n \le 17$ [7, 9]; an algorithm using the same
idea was built [6] to provide a computer-assisted proof for the comparator-optimal
sorting networks with nine and ten inputs.
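
To illustrate the representation of a comparator network by the itemset of its outputs, the
following Python sketch may help. It rests on assumptions that are ours and not stated in the
text above: a comparator is a pair (i, j) of 0-based wire indices with i < j that places the
smaller value on wire i, and only binary inputs are enumerated, so that each output word is an
item in the sense of Section 1.2. The function names are hypothetical.

```python
from itertools import product
from typing import FrozenSet, List, Tuple

Comparator = Tuple[int, int]  # assumption: (i, j) with i < j, minimum goes to wire i
Word = Tuple[int, ...]


def run_network(network: List[Comparator], bits: Word) -> Word:
    """Apply the comparators, in order, to one binary input word."""
    out = list(bits)
    for i, j in network:
        if out[i] > out[j]:
            out[i], out[j] = out[j], out[i]
    return tuple(out)


def output_itemset(network: List[Comparator], n: int) -> FrozenSet[Word]:
    """Set of outputs of the network over all 2^n binary inputs."""
    return frozenset(run_network(network, bits) for bits in product((0, 1), repeat=n))


def sorts_all_binary_inputs(network: List[Comparator], n: int) -> bool:
    """True iff every binary output word is in non-decreasing order."""
    return all(list(w) == sorted(w) for w in output_itemset(network, n))


three_sorter = [(0, 1), (1, 2), (0, 1)]
partial = [(0, 1)]

print(sorted(output_itemset(three_sorter, 3)))
print(len(output_itemset(partial, 3)))
```

The smaller the output itemset, the closer the network is to sorting: a network that sorts all
binary inputs on n wires leaves exactly n + 1 distinct output words.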

Remark 2.1. In this Section 2 we omit some important details, due to space and topic limitations,
such as the exact form/shape of the dataset D for which such a search space reduction is permitted
in the sorting networks optimization domain. For more information on this topic, we refer the
reader to the excellent papers by Bundala, Zavodny and Codish et al. They were
the first to put forward the idea of subitemset isomorphism (in the sorting networks optimization
domain), although not in such a general context as presented in this paper.

3. Related Work 

3.1. Complexity Analysis of CM [12] Sorting Network Isomorphism

Choi and Moon [12] describe a sorting network isomorphism and present an algorithm
aimed at reducing the search space in sorting networks optimization. They show
that their (CM) isomorphism is polynomial-time equivalent to the Graph Isomorphism
problem of bounded valence. The GI problem of bounded valence can be solved efficiently
[13] in polynomial time.

The work of Choi and Moon is an inspiration for our work because we examine a
stronger (BZ [7]) isomorphism of sorting networks than the CM isomorphism. We prove
that the BZ itemset isomorphism problem is polynomial-time equivalent to the (generic)
version of the Graph Isomorphism problem. Moreover, we show that the BZ subitemset
isomorphism problem is GI-Hard.

3.2. Known Algorithms for the Subitemset Isomorphism Problem

Given a dataset F, the subitemset isomorphism relation $\preceq$ induces a partial order on F.
To optimize the search for optimal sorting networks it is enough [6] (Section 3) to consider
only the minimal representative itemsets within F up to $\preceq$.

Multiple [14, 16] very fast deterministic practical algorithms for finding class
representatives of minimal itemsets up to $\preceq$ within a dataset D exist in the literature.
However, the worst-case complexity of all these known algorithms is exponential in n,
the size of the domain/alphabet, as defined in Section 1.2.

In this paper, we narrow the gap between theory and practice by formally proving
that the SI decision problem is at least as difficult as the Graph Isomorphism (GI) decision
problem. Moreover, we show that the Itemset Isomorphism problem is GI-Complete; that
is, polynomial-time equivalent to the Graph Isomorphism (GI) problem.

3.3. Complexity Analysis of Graph Isomorphism 

The graph isomorphism problem is one of two problems listed [15] by Garey and Johnson whose
complexity is yet to be classified. The possible complexity classes include, but are not limited
to: P, NP-Complete, QP (as very recent research by L. Babai suggests). Over the years, there has
been substantial research on the GI problem: fast practical algorithms with or without domain
restrictions, complexity analysis [20, 13], GI-Complete problems
[22, 23], etc. More importantly, it is commonly believed that GI-Complete problems
form a distinct complexity class that sits between P and NP-Complete, but this
is yet to be proved. In this paper, we use the fact that the Hypergraph Isomorphism
(HGI) decision problem is polynomial-time equivalent [23] to the Graph Isomorphism
(GI) problem.

3.4. Itemset Mining 

It is worth noting that the itemset terminology used throughout this paper is widely
used in the context of database mining and frequent itemset mining. The connection between
the graph isomorphism problem and itemset mining is evident in the work of Juan et al. [24] who
investigate the subgraph mining problem. Another data mining example is the work of
Nuyoshi et al. [25] who use quantitative itemset mining techniques to mine frequent
graph patterns.

Lastly, we note that the problem of finding minimal itemsets within a dataset up to
subitemset isomorphism is a generalization of the extremal sets problem [14]: in
the extremal sets problem, relabelling of domain elements is not permitted. The problem
of finding extremal sets has received considerable attention in recent years. Moreover,
practical algorithms [14] for finding minimal itemsets up to subitemset isomorphism
within a dataset first find the minimal itemsets within the dataset to reduce the search
space.


4. Complexity Classification 

In this section we first formally define the Graph Isomorphism (GI) decision problem.
We then proceed with an informal discussion of how the GI and II problems
"differ". In Section 4.2 we show that $GI \le_p II$ and then in Section 4.3 we show that
$II \le_p GI$. It then follows that II is GI-Complete and, as a natural
consequence, that SI is GI-Hard ($GI \le_p SI$).


Definition 4.1. Graph Isomorphism (GI) decision problem:

Input: Two undirected graphs $G = (V_G, E_G)$ and $H = (V_H, E_H)$.

Question: Is there a bijection $I : V_G \to V_H$ s.t. $(v, w) \in E_G$ iff $(I(v), I(w)) \in E_H$?
         G                      H
       1 2 3 4                1 2 3 4
    1  0 1 1 1             1  0 1 0 1
    2  1 0 0 1             2  1 0 0 1
    3  1 0 0 0             3  0 0 0 1
    4  1 1 0 0             4  1 1 1 0

Figure 2: This figure presents a graph G and its adjacency matrix. We also present a swap of the vertices
1 and 4 of G to obtain H. To construct the adjacency matrix of H from the adjacency matrix of G, we first
swap the rows 1 and 4 of G and then swap the columns 1 and 4. Since every permutation (of the vertices
of G) can be written as a sequence of swaps, this figure shows the methodology of applying a permutation
to a graph.


4.1. Discussion 

Before presenting a rather technical proof that the II decision problem is GI-Complete,
we give a brief discussion on how the GI and II problems "differ". Intuitively, the two
problems are very similar as the inputs to both problems can be represented as zero-one
matrices (see Figures 2 and 3); however, there are two fundamental differences.

• In the GI problem a swap of two vertices is represented as a swap of two rows and two
columns of the zero-one adjacency matrix (Figure 2), whereas in the II problem a
swap of two domain elements is represented as a swap of two columns (Figure 3),
with the rows left intact.

• A valid solution for the GI problem requires the two zero-one adjacency matrices to
match exactly, whereas in the II problem any reordering of the rows is permitted
(recall Remark 1.2).
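
The two differences can be seen side by side in the following Python sketch, which is only an
illustration with hypothetical function names. The matrices are the ones we read off Figures 2
and 3, and indices are 0-based in the code, so the swap of vertices 1 and 4 (or of $d_1$ and
$d_4$) becomes a swap of indices 0 and 3.

```python
from typing import List

Matrix = List[List[int]]


def swap_vertices(adj: Matrix, u: int, v: int) -> Matrix:
    """GI: relabelling two vertices swaps rows u, v and columns u, v."""
    m = [row[:] for row in adj]
    m[u], m[v] = m[v], m[u]
    for row in m:
        row[u], row[v] = row[v], row[u]
    return m


def swap_elements(rows: Matrix, p: int, q: int) -> Matrix:
    """II: relabelling two domain elements swaps columns p, q only."""
    m = [row[:] for row in rows]
    for row in m:
        row[p], row[q] = row[q], row[p]
    return m


def same_itemset(m1: Matrix, m2: Matrix) -> bool:
    """Itemsets are sets of rows, so the order of the rows is irrelevant."""
    return {tuple(r) for r in m1} == {tuple(r) for r in m2}


G = [[0, 1, 1, 1], [1, 0, 0, 1], [1, 0, 0, 0], [1, 1, 0, 0]]  # adjacency matrix, Figure 2
S = [[1, 1, 1, 0], [1, 0, 0, 1], [1, 1, 0, 0], [0, 0, 1, 1]]  # itemset matrix, Figure 3

H = swap_vertices(G, 0, 3)
T = swap_elements(S, 0, 3)
```

Comparing two adjacency matrices uses plain equality, whereas two itemset matrices are compared
with `same_itemset`, which ignores the order of the rows.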


4.2. II is GI-Hard 

Lemma 4.2. $GI \le_p II$.

The proof of Lemma 4.2 is a rather technical one. However, the proof is constructive,
and we present examples in Figures A.4 and A.5 for the essential steps of the proof. A
detailed explanation of the examples following the steps of the proof of Lemma 4.2 is
presented in Appendix A.
          S                    J : D -> D                   T
       d_1 d_2 d_3 d_4                                   d_1 d_2 d_3 d_4
    1   1   1   1   0          d_1 -> d_4             1   0   1   1   1
    2   1   0   0   1          d_2 -> d_2             2   1   0   0   1
    3   1   1   0   0          d_3 -> d_3             3   0   1   0   1
    4   0   0   1   1          d_4 -> d_1             4   1   0   1   0

Figure 3: This figure presents an itemset S over the domain $D = \{d_1, d_2, d_3, d_4\}$ and its matrix
representation (as described in Section 1.2 and Figure 1(b)). We also present a swap of the domain
elements $d_1$ and $d_4$ of S to obtain T. To construct the matrix representation of T from the matrix
of S, we need to swap the two columns $d_1$ and $d_4$. Since every permutation (of the domain D) can be
written as a sequence of swaps, this figure shows the methodology of applying a permutation to an
itemset.


Proof. Define the function $f : (G, H) \mapsto (S, T)$ where $(G, H)$ is an input to GI and $(S, T)$ is an
input to II. The itemset $S = \{S_u \mid u \in V_G\}$ where the items $S_u = \{(u, v) \in E_G \mid v \in V_G\}$.
Similarly, the itemset $T = \{T_h \mid h \in V_H\}$ where the items $T_h = \{(h, w) \in E_H \mid w \in V_H\}$.
In other words, the domains of S and T are the edge sets $E_G$ and $E_H$, and each item collects the
edges incident to one vertex. We now show that the function f is a polynomial-time reduction of
Graph-Isomorphism to Itemset-Isomorphism.

First, we need to show that the function f is computable in polynomial time. This is
the case because f does no computation beyond re-structuring the input.
Hence, the reduction function f is polynomial time.

To prove that the presented polynomial-time reduction is correct, we need to show
that a Graph-Isomorphism instance is satisfiable (yes instance) if and only if the
f-induced Itemset-Isomorphism instance is satisfiable.

Suppose that the Graph-Isomorphism instance is satisfiable: there exists a bijection
$I : V_G \to V_H$ s.t. $(v, w) \in E_G$ iff $(I(v), I(w)) \in E_H$. We claim that $J : (v, w) \mapsto (I(v), I(w))$
satisfies $J(S) = T$. To see this, consider any item $S_g = \{(g, x) \in E_G \mid x \in V_G\} \in S$ and
apply the bijection J to it. Then we have
$J(S_g) = \{(I(g), I(x)) \in E_H \mid x \in V_G\} = \{(I(g), y) \in E_H \mid y \in V_H\} = T_{I(g)} \in T$.
Also note that, since I is bijective, $J^{-1}$ exists and we can similarly show that for any
$T_h \in T$ we have $J^{-1}(T_h) = S_{I^{-1}(h)} \in S$. Hence,
we have shown that, if any Graph-Isomorphism instance $(G, H)$ is satisfiable then the
created Itemset-Isomorphism instance $f((G, H))$ is satisfiable.

Now suppose that the created Itemset-Isomorphism instance $f((G, H)) = (S, T)$ is
satisfiable: there is a bijection $J : E_G \to E_H$ s.t. $J(S) = T$. By Definition 1.1 of itemset
isomorphism and Remark 1.2 we know there exists a bijection $\sigma$ that maps the items in
$J(S)$ to the items in T. Hence, $\sigma : V_G \to V_H$ is such that for any $S_g \in S$ we have
$J(S_g) = T_{\sigma(g)} \in T$, and vice versa. We claim that $\sigma$ gives a graph isomorphism from G to H.
To see this, notice that for all $(v, w) \in E_G$ we have $(\sigma(v), \sigma(w)) = J((v, w))$ (we are working with
undirected graphs). But, from the assumption we know that $J((v, w)) \in E_H$; to go from
$E_H$ to $E_G$ is systematically the same because $\sigma$ is bijective, hence $\sigma^{-1}$ exists. Therefore,
we have shown that if the created Itemset-Isomorphism instance $f((G, H)) = (S, T)$ is
satisfiable then the original Graph-Isomorphism instance $(G, H)$ is satisfiable. □
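
The construction used in this proof can be sketched in Python as follows. The sketch is
illustrative only: the function names are hypothetical, undirected edges are encoded as
two-element frozensets, and the example graph is the one whose adjacency matrix we read off
Figure 2, chosen so that distinct vertices have distinct sets of incident edges.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

Edge = FrozenSet[int]   # undirected edge {v, w}
Item = FrozenSet[Edge]  # all edges incident to one vertex


def graph_to_itemset(vertices: Iterable[int],
                     edges: Iterable[Tuple[int, int]]) -> FrozenSet[Item]:
    """Reduction f for one graph: the domain is the edge set, one item per vertex."""
    edge_list: List[Edge] = [frozenset(e) for e in edges]
    return frozenset(frozenset(e for e in edge_list if u in e) for u in vertices)


def induced_edge_bijection(I: Dict[int, int],
                           edges: Iterable[Tuple[int, int]]) -> Dict[Edge, Edge]:
    """J : (v, w) -> (I(v), I(w)), built from a vertex bijection I."""
    return {frozenset(e): frozenset(I[v] for v in e) for e in edges}


def apply_to_itemset(J: Dict[Edge, Edge], S: FrozenSet[Item]) -> FrozenSet[Item]:
    """J(S), obtained by relabelling every domain element (edge) of every item."""
    return frozenset(frozenset(J[e] for e in item) for item in S)


G_edges = [(1, 2), (1, 3), (1, 4), (2, 4)]
I = {1: 4, 2: 2, 3: 3, 4: 1}                       # swap of the vertices 1 and 4
H_edges = [(I[v], I[w]) for v, w in G_edges]

S = graph_to_itemset(range(1, 5), G_edges)
T = graph_to_itemset(range(1, 5), H_edges)
J = induced_edge_bijection(I, G_edges)

print(apply_to_itemset(J, S) == T)
```

The example follows the first direction of the proof: a graph isomorphism I is lifted to a
bijection J on the edges, and J maps the itemset of G onto the itemset of H.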
4.3. GI is II-Hard

In order to prove that there is a polynomial-time reduction from the II decision problem
to the GI decision problem, we require an intermediate result. Namely, we are going to
show that the Hypergraph Isomorphism (HGI, see Definition 4.3 below) decision
problem is II-Hard; in other words, we will show that HGI is at least as hard (up to a
polynomial transformation) as II. We also use the known fact that HGI is polynomial-time
equivalent to GI, formally stated in Theorem 4.4.

We refer to a hypergraph $G = (V_G, E_G)$ as a combinatorial object consisting of nodes
$V_G$ and edges $E_G$, except that each edge consists of an arbitrary number of nodes from $V_G$.
Examples of hypergraphs are given in Appendix B.

Definition 4.3. Hypergraph Isomorphism (HGI) decision problem:

Input: Two undirected hypergraphs $G = (V_G, E_G)$ and $H = (V_H, E_H)$.
Question: Is there a bijection $I : V_G \to V_H$ s.t. $A \in E_G$ iff $I(A) \in E_H$,
where $A \subseteq V_G$ and $I(A) = \{I(v) \mid v \in A\}$?

Theorem 4.4. HGI is GI-Complete.

Proof. The statement of Theorem 4.4 is very well known in the community [27]. We refer
the reader to [23]. □

To summarize our overall strategy, we prove that $II \le_p HGI$; since HGI is GI-Complete,
we deduce that GI is II-Hard.

Lemma 4.5. $II \le_p HGI$.

The proof of Lemma 4.5 is very similar to the proof of Lemma 4.2 from Section 4.2.
However, there is a complication: namely, an instance (S, T) of the II problem in which
either S or T contains a column whose total number of ones is not equal to two in the
matrix representation (as described in Section 1.2 and Figure 1(b)). This situation requires
a slightly different mechanism to the one described in the proof of Lemma 4.2.

In the following proof of Lemma 4.5 we define a polynomial-time reduction function
$g : (S, T) \mapsto (G, H)$ from the II problem to the HGI problem. The natural extension
of our machinery (proof of Lemma 4.2) to solve the complication is to think of the ones
in any column of the matrix representation of an itemset as defining a hyperedge in a
hypergraph. As an example, consider the itemset S in Figure 3; then the g-corresponding
hyperedge for the column $d_1$ would be the set of nodes {1, 2, 3}, and the hyperedge for
the column $d_2$ would be the set of nodes {1, 3}; the latter is a normal edge only because the
cardinality of the hyperedge is 2. We give examples of the following constructive proof
in Appendix B.

Proof. Let us start by formally defining the function $g : (S, T) \mapsto (G, H)$ over the
itemsets $S = \{S_g\}$ and $T = \{T_h\}$ over the domains $D_S = \{s_1, s_2, \ldots, s_n\}$ and
$D_T = \{t_1, t_2, \ldots, t_n\}$ respectively. We set $G = (V_G, E_G)$ and $H = (V_H, E_H)$ s.t.
$V_G = \{g \mid S_g \in S\}$, $E_s = \{g \mid s \in S_g : S_g \in S\}$, $E_G = \{E_s \mid s \in D_S\}$ and
$V_H = \{h \mid T_h \in T\}$, $E_t = \{h \mid t \in T_h : T_h \in T\}$, $E_H = \{E_t \mid t \in D_T\}$.
We claim that g is a polynomial-time reduction of II to HGI.

It is clear that g can be implemented in polynomial time, since g performs no calculation
and only transforms the itemsets S and T to the hypergraphs G and H respectively;
where $|V_G| = |S|$, $|E_G| = |D_S|$, $|V_H| = |T|$ and $|E_H| = |D_T|$.

It is now left to show that the Itemset Isomorphism instance (S, T) is satisfiable (yes
instance) if and only if the g-induced Hypergraph Isomorphism instance (G, H) is satisfiable.

Suppose that the Itemset Isomorphism instance (S, T) is satisfiable: there exists a
bijection $J : D_S \to D_T$ s.t. $J(S) = T$. Using Definition 1.1, Remark 1.2 and the
construction of g, we deduce there exists a bijection $\sigma : V_G \to V_H$ s.t. $S_g \in S$ if and only
if $J(S_g) = T_{\sigma(g)} \in T$. We claim that the bijection $\sigma$ gives a hypergraph isomorphism
between G and H. To prove our claim, consider any $E_s = \{g \mid s \in S_g : S_g \in S\} \in E_G$.
Applying the bijections $\sigma$ and J to the set of vertices $E_s$ (hyperedge) is equivalent
to $\{\sigma(g) \mid J(s) \in J(S_g) : J(S_g) \in J(S)\}$; simplifying (by setting $h = \sigma(g)$, $t = J(s)$ and
$J(S_g) = T_h$) gives us $\{h \mid t \in T_h : T_h \in T\} = E_{J(s)} = E_t$. Since $\sigma$ and J are bijective,
we deduce that $\sigma$ gives a hypergraph isomorphism between G and H. Hence, we have
shown that if the Itemset Isomorphism instance (S, T) is satisfiable then the Hypergraph
Isomorphism instance $g((S, T)) = (G, H)$ is also satisfiable.

Suppose that the Hypergraph Isomorphism instance $g((S, T)) = (G, H)$ is satisfiable:
there exists a bijection $I : V_G \to V_H$ s.t. $E_s \in E_G$ if and only if $I(E_s) \in E_H$. However,
since $E_H = \{E_t \mid t \in D_T\}$, we can define a bijection $\gamma : D_S \to D_T$ such that $s \mapsto t$
if and only if $I(E_s) = E_t$. We claim that $\gamma$ gives an Itemset Isomorphism between S and
T. This is seen because $\gamma(S_g) = T_{I(g)}$ for all $g \in V_G$, and due to $\gamma$ and I being
bijections. □
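
The function g from this proof can be sketched in Python for a single itemset. The sketch is an
illustration under our own assumptions: the function name is hypothetical, the items are passed
as a list so that their positions 1, 2, ... can serve as vertex labels (any labelling of the
items would do), and domain elements are plain strings. The example itemset is the matrix S that
we read off Figure 3.

```python
from typing import Dict, FrozenSet, List, Sequence, Tuple


def itemset_to_hypergraph(
    items: Sequence[FrozenSet[str]], domain: Sequence[str]
) -> Tuple[List[int], Dict[str, FrozenSet[int]]]:
    """Reduction g for one itemset.

    Vertices are the item labels 1..|S|. For every domain element s, the
    hyperedge E_s contains the labels of the items that contain s, i.e. the
    positions of the ones in column s of the matrix representation.
    """
    vertices = list(range(1, len(items) + 1))
    hyperedges = {
        s: frozenset(g for g, item in zip(vertices, items) if s in item)
        for s in domain
    }
    return vertices, hyperedges


domain = ["d1", "d2", "d3", "d4"]
S = [
    frozenset({"d1", "d2", "d3"}),  # row 1: 1 1 1 0
    frozenset({"d1", "d4"}),        # row 2: 1 0 0 1
    frozenset({"d1", "d2"}),        # row 3: 1 1 0 0
    frozenset({"d3", "d4"}),        # row 4: 0 0 1 1
]

V_G, E_G = itemset_to_hypergraph(S, domain)
for s in domain:
    print(s, sorted(E_G[s]))
```

The hyperedge of column d1 has three nodes, which is exactly the situation that a plain graph
edge cannot express and that motivates the detour through hypergraphs.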


Lemma 4.6. $II \le_p GI$.

Proof. Immediate consequence of applying Theorem 4.4 and Lemma 4.5. □


4.4. II is GI-Complete

To complete our proof that II is GI-Complete, we still need to show that $II \in NP$.

Lemma 4.7. $II \in NP$.


Proof. We need to show that a polynomial-time verifier for the Itemset-Isomorphism problem
exists to conclude that II is in NP. Given a bijection J as a certificate, the verifier
needs to check whether $J(S) = T$. Clearly the application of the bijection J to S can be done
in polynomial time. The equality check can be done in polynomial time because $J(S)$
and T are sets whose number of elements is polynomial in the size of the input. □
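
A verifier of the kind used in this proof can be sketched in a few lines of Python. The function
name and the encoding (items as frozensets, the certificate J as a dictionary) are assumptions
made for this illustration; the point is only that every step is a single pass over the input.

```python
from typing import Dict, FrozenSet, Hashable, Iterable, Set


def verify_itemset_isomorphism(
    S: Set[FrozenSet[Hashable]],
    T: Set[FrozenSet[Hashable]],
    dom_S: Iterable[Hashable],
    dom_T: Iterable[Hashable],
    J: Dict[Hashable, Hashable],
) -> bool:
    """Check that the certificate J is a bijection dom_S -> dom_T with J(S) = T."""
    if set(J.keys()) != set(dom_S) or set(J.values()) != set(dom_T):
        return False                      # wrong domain or not onto dom_T
    if len(set(J.values())) != len(J):
        return False                      # not injective
    image = {frozenset(J[d] for d in item) for item in S}
    return image == set(T)


dom = [1, 2, 3, 4]
S = {frozenset({1, 2, 3}), frozenset({1, 4}), frozenset({1, 2}), frozenset({3, 4})}
T = {frozenset({2, 3, 4}), frozenset({1, 4}), frozenset({2, 4}), frozenset({1, 3})}

swap_1_4 = {1: 4, 2: 2, 3: 3, 4: 1}
identity = {1: 1, 2: 2, 3: 3, 4: 4}

print(verify_itemset_isomorphism(S, T, dom, dom, swap_1_4))
print(verify_itemset_isomorphism(S, T, dom, dom, identity))
```

With hashed sets the comparison of J(S) and T takes time roughly linear in the total size of the
two itemsets, which is well within the polynomial bound required by the lemma.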

Theorem 4.8. II is GI-Complete.

Proof. Follows immediately by applying Lemmas 4.2, 4.6 and 4.7. □

Corollary 4.9. SI is GI-Hard.

Proof. An immediate consequence of Theorem 4.8 because $II \le_p SI$ and $SI \in NP$: for
itemsets of equal cardinality, any bijection J with $J(S) \subseteq T$ also satisfies $J(S) = T$. □

Furthermore, we deduce that finding a class representative up to subitemset
isomorphism is GI-Hard (clearly polynomial-time reducible to the SI problem).


5. Conclusion and Future Work 

Fast algorithms for the Subitemset Isomorphism (SI) problem are of practical importance
in the sorting networks optimization domain. The SI problem is encountered
in recent breakthrough sorting networks optimization research; however,
its worst-case computational complexity classification is an open problem. This
paper proves the Itemset Isomorphism (II) decision problem to be GI-Complete, i.e.
polynomial-time equivalent to the Graph Isomorphism (GI) decision problem. As a corollary,
the SI problem is shown to be GI-Hard. The complexity analysis presented here is
of importance to research aimed at fast practical algorithms [14] for the SI problem,
as well as to extending the list [22] of GI-Complete problems which are of practical
importance too.

For future work, we aim to classify the SI problem more precisely, rather than only giving
the lower complexity bound presented here. We suspect that the problem of Subitemset
Isomorphism is NP-Complete. The reason is that there is an intuitive relation between the pair of
Graph-Isomorphism (GI-Complete) and Subgraph-Isomorphism (NP-Complete) and
the pair of Itemset-Isomorphism (GI-Complete) and Subitemset-Isomorphism (GI-Hard).
Figure A.4: An example of two isomorphic graphs G and H together with the corresponding isomorphic
itemsets S and T generated by the polynomial-time reduction function $f : (G, H) \mapsto (S, T)$, as described in
the proof of Lemma 4.2. This figure serves as a detailed example of the constructive proof of Lemma 4.2.
In the figure we see that there is a unique isomorphism between G and H, given by I, and a unique
isomorphism J between S and T. For a detailed explanation of this figure refer to Appendix A.1.


Appendix A. Examples: II is GI-Hard 


The examples presented in Figures A.4 and A.5 demonstrate how to apply the polynomial-time
transformation function f (defined in the proof of Lemma 4.2) to an instance (G, H)
of the GI problem to produce an instance (S, T) of the II problem. From the examples, it
is clear that (G, H) is satisfiable if and only if (S, T) is satisfiable.

Following the proof of Lemma 4.2 and the two Figures A.4 and A.5, we see exactly
how to construct J using I, and vice versa; where I gives the graph isomorphism between
G and H, and J gives the itemset isomorphism between S and T. Note that
Lemma 4.2 works only for undirected graphs; a more technical proof is required for the
case of directed graphs but is not necessary for the complexity classification of the itemset
isomorphism problem.


Appendix A.1. Figure A.4


The example presented in Figure A.4 shows two isomorphic graphs and their f-corresponding
isomorphic itemsets; recall f from the proof of Lemma 4.2. It is clear that
the two graphs G and H are uniquely isomorphic: there exists a unique bijection
$I : V_G \to V_H$ that satisfies $(v, w) \in E_G \leftrightarrow (I(v), I(w)) \in E_H$.

Hence, given the satisfiable instance (G, H) and the bijection $I : V_G \to V_H$, in the
proof of Lemma 4.2 we claim that $J : (v, w) \mapsto (I(v), I(w))$ satisfies $J(S) = T$. One can
easily check the graphs and itemsets in Figure A.4 to verify the correctness of this claim.

Now, suppose we are given a bijection $J : E_G \to E_H$ s.t. $J(S) = T$. Clearly for the
example in Figure A.4 we have a unique $\sigma = I$ that maps the items in $J(S)$ to the items
in T.

Figure A.5: An example of two isomorphic graphs G and H together with the corresponding isomorphic
itemsets S and T generated by the polynomial-time reduction function $f : (G, H) \mapsto (S, T)$, as described in
the proof of Lemma 4.2. This figure serves as a detailed example of the constructive proof of Lemma 4.2.
In the figure we see that there are exactly two isomorphisms between G and H, given by $I_1$ and $I_2$, and
exactly two isomorphisms $J_1$ and $J_2$ between S and T, where $I_1$ corresponds to $J_1$ and $I_2$ corresponds to
$J_2$. For an in-depth explanation of this figure refer to Appendix A.2.


Appendix A.2. Figure A.5


The example presented in Figure A.5 shows two isomorphic graphs and their f-corresponding
isomorphic itemsets. This example is more explanatory than the one presented
in Figure A.4 because the isomorphisms between the graphs G and H are not
unique. Using the constructive proof of Lemma 4.2 we see that the graph isomorphism
$I_1$ corresponds to the itemset isomorphism $J_1$ and, similarly, $I_2$ corresponds to $J_2$.
Figure B.6: An example of two isomorphic itemsets S and T together with their g-corresponding
isomorphic hypergraphs G and H generated by the polynomial-time reduction function
$g : (S, T) \mapsto (G, H)$, as described in the proof of Lemma 4.5. This figure serves as a detailed
example of the constructive proof of Lemma 4.5. In the figure we see that there is a unique
isomorphism between S and T, given by J, and a unique isomorphism I between G and H. For a
detailed explanation of this figure refer to Appendix B.1.


Appendix B. Examples: HGI is II-Hard 

The examples presented in Figures B.6 and B.7 demonstrate how to apply the polynomial-time
transformation function g (defined in the proof of Lemma 4.5) to an instance (S, T) of
the II decision problem to produce an instance (G, H) of the HGI decision problem. From
the examples, it is clear that (S, T) is satisfiable if and only if (G, H) is satisfiable. Note
that we focus our attention on itemsets S and T having at least one column with total
number of 1's not equal to two (a proper hyperedge) in their matrix representation (see
Section 1.2).


Appendix B.1. Figure B.6


The purpose of this figure is to gently introduce the reader to the itemset and hypergraph
representations, as well as to cover the basic idea of the proof of Lemma 4.5. Figure B.6
shows two uniquely (given by J) isomorphic itemsets S and T. Following the
proof of Lemma 4.5 we claim that $\sigma = I : V_G \to V_H$ gives a hypergraph isomorphism between
the g-induced hypergraphs G and H from S and T respectively. From the figure, it is clear
that I is a unique isomorphism between G and H, supporting our claim from the proof
of Lemma 4.5 that G and H are isomorphic if and only if S and T are isomorphic, where
$g((S, T)) = (G, H)$.

Figure B.7: An example of two isomorphic itemsets S and T together with their g-corresponding
isomorphic hypergraphs G and H generated by the polynomial-time reduction function
$g : (S, T) \mapsto (G, H)$, as described in the proof of Lemma 4.5. This figure serves as a detailed
example of the constructive proof of Lemma 4.5. In the figure we see that there is a unique
isomorphism between S and T, given by J, and two isomorphisms $I_1$ and $I_2$ between the
hypergraphs G and H. For a detailed explanation of this figure refer to Appendix B.2.


Appendix B.2. Figure B.7

Figure B.7 shows a rather complicated scenario of two input itemsets S and T and
their g-induced (as defined in the proof of Lemma 4.5) hypergraphs G and H respectively. This
case is interesting because there is a unique itemset isomorphism J between S and T;
however, there are two ($\sigma_1 = I_1$ and $\sigma_2 = I_2$) hypergraph isomorphisms between G and H.
Note that the line styles of the hyperedges of both hypergraphs (G and H) match if and
only if one hyperedge is mapped to the other as given by J; for example, the hyperedge
$s_1$ in G is mapped to the hyperedge $t_4$ in H, noting that $J(s_1) = t_4$.